Scaredy-cats don’t succeed: behavioral traits predict problem-solving success in captive Felidae

Behavioral traits can be determined from the consistency in an animal’s behaviors across time and situations. These behavioral traits may have been differentially selected in closely related species. Studying the structure of these traits across species within an order can inform a better understanding of the selection pressures under which behavior evolves. These adaptive traits are still expected to vary within individuals and might predict general cognitive capacities that facilitate survival, such as behavioral flexibility. We derived five facets (Flexible/Friendly, Fearful/Aggressive, Uninterested, Social/Playful, and Cautious) from behavioral trait assessments based on zookeeper surveys in 52 Felidae individuals representing thirteen species. We analyzed whether age, sex, species, and these facets predicted success in a multi-access puzzle box, a measure of innovation. We found that the Fearful/Aggressive and Cautious facets were negatively associated with success. This research provides the first test of the association between behavioral trait facets and innovation in a diverse group of captive Felidae. Understanding the connection between behavioral traits and problem-solving can assist in ensuring the protection of diverse species in their natural habitats and ethical treatment in captivity.


INTRODUCTION
Personality generally refers to consistency in behavioral traits across time and situations. Applying the term to nonhuman animals has sometimes been considered controversial because of its anthropomorphic nature (Gosling & Vazire, 2002). Therefore, researchers often refer to behavioral traits, rather than personality, when working with nonhumans, and we will follow that convention. Behavioral traits will be used here to refer to individual differences in the expression of behavioral tendencies such as decision-making, risk taking, subjective wellbeing, and coping strategies (Dall, Houston & McNamara, 2004) that are consistent across time and context. We explore the association between these traits and the ability of individuals of various felid species to solve a task designed to measure innovation.
In captivity, individual behavioral traits affect animals' experiences and predict their adjustment and behaviors. Animal behavioral traits are primarily assessed in three ways: keeper assessments, behavioral coding, and preference tests (Watters & Powell, 2012). Keeper assessments involve familiar individuals rating the predefined traits of each subject on a scale based on their human-animal relationships (HAR), knowledge of the subject accumulated across time, mutual recognition between the human and animal, and the nature of interactions (positive or negative). Good HARs require the human and animal to have a history of positive interactions to allow successful behavioral predictions based on the caregiver's experience witnessing an animal's consistent likelihood to engage in different behaviors during the length of their relationship (Estep, 1992). In this manner, survey assessments also capture consistency in behavior across time and situations, but these assessments are subjective. First, raters can interpret trait definitions differently. For example, raters may define "flexible" differently based on their own experiences and biases. Whereas in the survey used here, flexible is defined as "adapts comfortably to change," change can be defined as a shift in exhibits, conspecifics, and/or zookeepers. Keepers may use different experiences as reference points for their ratings based on their own interactions with the animal. Similarly, one keeper may define "bold" as reacting aggressively to novelty, while another may see that as acting fearfully and score the individual low on boldness. In addition, keepers may allow their own biases to color their ratings; for example, they may be more likely to rate male animals high and to rate female animals low on boldness even when the animals behave similarly. Thus, the use of keeper assessments is strengthened when multiple keepers have built a relationship with the subject and reliability can be determined.
Another potential issue with keeper assessments is that the traits animals are rated for are often derived from top-down models that may not fit the target species. Thus, it is important to first understand the traits that best describe the variability in the behavior of members of the study species (Vonk & Eaton, 2018). Of the various possible forms of keeper assessment identified by Uher (2011), we adopted a lexical top-down approach, using behavioral trait descriptors from closely related species to fit the current subjects. Top-down assessments begin with an existing model of personality and attempt to assess the extent to which a given species exhibits traits derived from that model. These assessments seek to identify facets, that is, sets of definable traits that correlate and can be grouped together under an umbrella term, akin to behavioral syndromes (Sih et al., 2004). Early research involving nonhuman animals focused on the popular five-factor model derived from humans (McCrae & Costa, 1987; Tupes & Christal, 1992), which includes the facets of openness, conscientiousness, neuroticism, extraversion, and agreeableness.
Factor analytic approaches have revealed that the structure of behavioral syndromes in nonhumans may differ from that established in humans. For example, in primates, behavioral syndromes are described by the combination of two or more of the following six facets: dominance, extraversion, dependability, emotional stability, agreeableness, and openness (Weiss, King & Figueredo, 2000). In canids, there is less agreement regarding which key traits compose the structure of canid behavioral syndromes. Domestic dogs exhibit a variety of traits that include extraversion, neuroticism, agreeableness, openness/conscientiousness (Gosling & John, 1999), playfulness, curiosity/fearlessness, chase-proneness, sociability, and aggressiveness (Svartberg & Forkman, 2002).
Of particular relevance to the current study, Stanton, Sullivan & Fazio (2015) conducted an in-depth meta-analysis on the current Felidae literature, ultimately creating a standardized Felidae ethogram. Ethograms include the documentation of species-exhibited behaviors by knowledgeable individuals (Stanton, Sullivan & Fazio, 2015) and allow behavioral tracking of captive populations that can be informative for predicting reproduction and overall welfare (Clubb & Mason, 2003). Most relevant studies on felids have derived models consisting of the following six facets: active, aggressive, curious, dominant, sociable, and timid/fearful/tense (Gartner & Weiss, 2013). For instance, in cheetahs (Acinonyx jubatus), Wielebnowski (1999) documented three major behavioral facets (tense-fearful, excitable-vocal, and aggressive), whereas Phillips et al. (2017) identified three facets (nervousness, adventurousness, and aggression) using keeper observations. Thus, similar facets emerged in both samples of cheetahs from studies conducted 18 years apart. Gartner, Powell & Weiss (2014) measured behavioral traits in five felid species. Keepers were asked to rate the cats on the same behavioral traits for each species. Slightly different facets emerged from a factor analysis: neuroticism, dominance, and impulsiveness in African lions; neuroticism, agreeableness/openness, and dominance/impulsiveness in clouded leopards; neuroticism, impulsiveness/openness, and dominance in snow leopards; dominance, impulsiveness, and neuroticism in domestic cats; and dominance, agreeableness, and self-control in Scottish wildcats. Nevertheless, there were common facets that appeared to characterize Felidae in general (Gartner, Powell & Weiss, 2014). This study lays the foundation for the current research, which examines variability within and between felid species.
The current research drew upon several previously used keeper assessments (Carlstead, Mellen & Kleiman, 1999; Gartner & Weiss, 2013; Gold & Maple, 1994; Martin-Wintle et al., 2017; Phillips & Peck, 2007; Wielebnowski, 1999; Wielebnowski et al., 2002) to incorporate the traits: active, anxious, calm, cautious, cooperative, curious, dominant, excitable, fearful, flexible, playful, smart, sociable, solitary, stereotypical, submissive, tense, vigilant, and uninterested. Having established the structure of behavioral traits in a taxon like felids, one can then examine whether the derived traits usefully predict behaviors in various contexts. Testing contexts, as defined by Freeman, Gosling & Schapiro (2011), incorporate the subjects' responses to a novel stimulus to elicit differing reactions from the subjects to document individual differences. The initial studies of Carlstead, Mellen & Kleiman (1999) and Powell & Svoke (2008) introduced the idea that observations of subjects' interactions with novel enrichment provide insight into their behavioral traits. Carlstead, Mellen & Kleiman (1999) paired keeper assessments with a novel object test and a novel conspecific scent test. Using a 52-trait and behavior assessment that they developed, keepers from different zoos were able to reliably differentiate black rhinoceros individuals (Diceros bicornis) based on their physical characteristics (e.g., sex, origin, and age) and six behavioral traits (e.g., olfactory behaviors, chasing/stereotypy/mouthing, fear, friendly to keeper, dominant, and patrolling). Similarly, Powell & Svoke (2008) evaluated giant pandas' (Ailuropoda melanoleuca) responses to ten novel enrichment items and their 23-trait and behavior assessment using keeper responses, allowing them to create individual behavioral profiles.
Studies have demonstrated that the novel-object test can be a reliable and valid behavioral trait tool in Felidae. Gartner & Powell (2012) used keeper assessments and coded behaviors in response to six novel objects to identify five dimensions (active/vigilant, curious/playful, calm/self-assured, timid/anxious, and friendly to humans) differentiating snow leopards (Panthera uncia) based on age and sex. Similarly, Phillips et al. (2017) examined four behavioral trait states in tigers (Panthera tigris), including aggression, fear, vigilance, and obedience; this time, using both keeper assessments and behaviors towards olfactory and physical enrichment. Ratings from behavioral trait assessments correlate with performance on novel object tests, validating the use of behavioral trait ratings.
The current work extends the existing literature demonstrating that behaviors elicited by novel tasks are useful in validating zookeeper assessments of captive carnivore behavioral traits by assessing whether keeper assessments can predict performance on a novel problem-solving task for environmental and cognitive enrichment. Here, the task was a multi-access puzzle box (MAB). Previous work found that behavioral measures of high persistence, high motor diversity/exploration diversity, high activity/working time, and low neophobia are associated with success on a MAB in carnivores. Behavioral trait facets similar to these behaviors are expected to predict performance in a task designed to measure behavioral flexibility. For example, traits such as 'Cautious' and 'Anxious' might relate to neophobia, whereas 'Playful' and 'Curious' might relate to exploration. There have been numerous efforts to relate behavioral traits to individual differences in cognition. For example, the expression of particular traits can influence cognitive performance, such as success or failure on a task (Carere & Locurto, 2011). However, there is, as of yet, no clear unifying theory about the expected association between personality and cognition in nonhumans (Griffin, Guillette & Healy, 2015; Sih & Giudice, 2012).

Species and rater information
Subjects included 52 individuals, 30 males and 22 females, from 13 species (see Table 1). The age of the subjects ranged from 6 months to 23 years (M = 6.68, SD = 5.96). Raters included thirty-seven keepers who spent, on average, 2.2 years with subjects (SD = 2.19) from five locations: the Bergen County Zoo (BCZ) in Paramus, New Jersey, the Bronx Zoo (BZ) in Bronx, New York, The Creature Conservancy (TCC) in Ann Arbor, Michigan, the Oklahoma Zoo (OKC) in Oklahoma City, Oklahoma, and the Turtle Back Zoo (TBZ) in West Orange, New Jersey. Because subjects were housed, and keepers worked, at different institutions, the same keepers did not rate all subjects.

Carnivore behavior survey and procedure
To properly compare the ability of behavioral traits to predict success in the MAB across all felids, individuals were not assigned species-unique traits. Instead, individuals were assessed on the same traits to determine whether individual or species level variation better predicts success in this task. Gosling & John (1999) reviewed 19 factor analytic studies across 12 nonhuman species, verifying considerable generality in non-related species. In an effort to examine behavioral differences in captive Felidae, Stanton, Sullivan & Fazio (2015) analyzed surveys of 30 species and 40 subfamilies to find that most behaviors identified in their study were similarly described and likely to apply to most felid species. In a similar review of 20 published studies, Gartner & Weiss (2013) found reasonable consistency of certain personality dimensions in felids. As such, we expected that the behaviors of the 13 species studied here would have similarly structured behavioral traits. Furthermore, all felids share morphological features such as binocular vision, flexible bodies with muscular limbs, protracting claws, and external ears with similar levels of  (Kitchener et al., 2010; Rothwell, 2003). However, it is important to note that it was not the goal of the current study to determine or compare the structure of behavioral syndromes in individual Felidae species. The twenty-seven-item behavioral trait survey was developed based on previous surveys (Feaver, Mendl & Bateson, 1986; Gartner & Powell, 2012; Stanton, Sullivan & Fazio, 2015; Wielebnowski, 1999), and included the trait "intelligent." However, we did not include scores on "intelligent" in our analyses, as it is not clear that this should be considered a personality trait rather than a cognitive trait and, as such, the results predicting success on a problem-solving task could be circular. Each item in the survey included a specific description. For example, Active was described as "moves about a lot" (see Table 2).
Four traits (Aggressive, Fearful, Friendly, and Uninterested) were rated with regard to three contexts: overall, with novelties or environmental changes, and with humans. All traits were rated on an eight-point Likert scale, where 0 = Doesn't apply, 1 = Does not describe at all, 4 = Neutral, and 7 = Describes very well.
Each keeper was given the questionnaire individually and instructed not to consult others, so that their responses reflected their independent ratings of the individual subjects.
Keepers were asked to provide the following information about themselves: age, sex, and years of experience with big cats, the species, the individual, and their zoo. In most cases, keepers completed the questionnaires without knowledge of how individuals performed in the MAB although this was not the case for a subset of the cats tested at OKC (n = 10). Given their long-term experience with the individuals, zookeepers' ratings of felid traits are unlikely to have been impacted by the individual's performance in the small number of experimental trials in a single task that were observed by any given keeper. Furthermore, any bias resulting from witnessing test sessions was expected to be limited to intelligence and we opted not to include intelligence in our model. Therefore, to retain the largest and most inclusive sample, we opted not to remove data from these subjects.

Problem-solving task and procedure
Upon completion of the surveys by the keepers, a problem-solving task, which involved retrieving a food reward from a custom multi-access puzzle box (MAB), was presented. This task presents a simple and effective behavioral test for exploring innovation and has been used successfully in a variety of carnivores (O'Connor et al., 2022). All subjects were tested individually in their indoor or outdoor off-exhibit holding enclosures. The custom multi-access puzzle boxes were two molded Starboard boxes with stainless-steel frames measuring 0.6 m × 0.6 m × 0.6 m and 0.38 m × 0.38 m × 0.38 m. A food reward placed inside the box was accessible via three separate solutions: (1) Push Door Technique (see Fig. 1); (2) Pull Rope Technique (see Fig. 2); and (3) Pull Door Technique (see Fig. 3). Each solution was presented on a different side of the box. The puzzle box was cleaned and disinfected between different species' trials and subjects could not see each other during trials.
Subjects underwent one trial per day. The trial began when the subject made physical contact with the puzzle box. Trials ended when the subject opened the puzzle box (a successful trial) or after 15 min elapsed without the subject opening the puzzle box (a failed trial). At the end of each trial, the subject was shifted to an adjacent enclosure according to the zoos' procedures. A subject either failed a condition, which was defined as failing to open the box in three out of five trials, or succeeded in a condition, which was defined as opening the box in three out of five trials. Subjects that succeeded moved on to the next condition. Subjects that failed did not advance to the next condition and testing was discontinued.
Condition 1 (five trials): The reward was retrievable via any of the solutions; all three doors were unlocked at the start of the first trial. Once a subject achieved their first successful trial, the door that they opened remained unlocked and the other two doors were locked for the remainder of the first condition. Three successful trials out of a possible five advanced the subject to the next condition.
Condition 2 (five trials): The remaining two unsolved doors were unlocked at the start of the first trial. Once a subject succeeded in opening an unlocked door, that door remained unlocked and the other two doors were locked for the remainder of the second condition. Three successful trials out of a possible five advanced the subject to the final condition.
Condition 3 (five trials): Only the remaining unsolved door was unlocked, and the subject was given five trials in which to open it three times, ending testing.
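The pass/fail rule for conditions can be made concrete with a short sketch. This is an illustration rather than the authors' scoring script: the function names are hypothetical, and it assumes that "three out of five" means at least three successful trials. The binary code returned by `success_score` follows the Success variable defined in the statistical analysis (1 = success on at least one condition).

```python
from typing import List

TRIALS_PER_CONDITION = 5  # from the text: five trials per condition
SUCCESSES_TO_PASS = 3     # from the text: three successful trials out of five


def condition_passed(trial_outcomes: List[bool]) -> bool:
    """True if the subject opened the box in at least three trials (assumed reading)."""
    if len(trial_outcomes) > TRIALS_PER_CONDITION:
        raise ValueError("a condition has at most five trials")
    return sum(trial_outcomes) >= SUCCESSES_TO_PASS


def conditions_passed(session: List[List[bool]]) -> int:
    """Count conditions passed in order; testing stops at the first failed condition."""
    passed = 0
    for outcomes in session[:3]:  # the protocol has three conditions
        if not condition_passed(outcomes):
            break
        passed += 1
    return passed


def success_score(session: List[List[bool]]) -> int:
    """Binary Success code: 1 if at least one condition was passed, else 0."""
    return 1 if conditions_passed(session) >= 1 else 0


# Hypothetical subject: passes Condition 1, fails Condition 2.
example = [
    [True, True, True, False, False],
    [False, False, False, True, True],
]
print(conditions_passed(example), success_score(example))
```

Because a failed condition ends testing, any trials listed after the first failed condition are ignored by `conditions_passed`.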

Statistical analysis
Data were analyzed using SPSS v. 28 software for Macintosh. Results were considered significant at alpha level p < 0.05.
For individuals that had more than one keeper rating their behavioral trait, interrater reliabilities were calculated using the Intraclass Correlation Coefficients (ICC) where we examined consistency between multiple raters using the model for Case 1 described by Shrout & Fleiss (1979) where each subject was rated by a different set of randomly selected raters that did not rate all subjects. Subjects of the same species at the same facility may have been rated by the same raters but the raters differed by species and facility. Items deemed unreliable, defined as having an ICC of less than or equal to zero, were omitted from further analysis. The average of the keepers' ratings was then used when there were multiple raters for a subject. Parallel analysis (Horn, 1965;O'Connor, 2000) was used as additional confirmation of the number of facets to be extracted from the survey data and both the mean and percentile estimations of model fit supported a five-component model to best fit the data. Subsequently, a principal components analysis (PCA) was conducted using a varimax rotation and five factors were extracted to combine the reliable behavioral traits into behavioral facets. Traits were assigned to the facet where they had the highest loading. Traits with negative loadings were reverse scored and composite variables taking the average ratings for each loaded trait were created. Only loadings of 0.40 or greater were considered. Fig. 4 depicts the structure of the first three facets.
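As a rough illustration of the reliability step, the sketch below computes the one-way random-effects ICC (Case 1 of Shrout & Fleiss, 1979) for a single survey item and applies the ICC ≤ 0 exclusion rule. It is not the SPSS procedure used in the study. It assumes every subject has the same number of raters, which was not true of the real data, and the text does not say whether the single-rater or averaged form was used, so both are returned. All names and ratings are hypothetical.

```python
from statistics import mean
from typing import List, Tuple


def icc_case1(ratings: List[List[float]]) -> Tuple[float, float]:
    """One-way random-effects ICC for one item.

    ratings[i] holds the k ratings given to subject i (equal k assumed).
    Returns (ICC(1,1), ICC(1,k)).
    """
    n = len(ratings)
    k = len(ratings[0]) if ratings else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 ratings")

    grand_mean = mean(x for row in ratings for x in row)
    subject_means = [mean(row) for row in ratings]

    # Between-subjects and within-subjects mean squares from a one-way ANOVA.
    bms = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    wms = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    if bms == 0:
        raise ValueError("no between-subject variance; ICC is undefined")

    single = (bms - wms) / (bms + (k - 1) * wms)  # ICC(1,1)
    average = (bms - wms) / bms                   # ICC(1,k)
    return single, average


def is_reliable(icc: float) -> bool:
    """Rule from the text: items with ICC <= 0 are dropped."""
    return icc > 0


# Hypothetical ratings of one trait: three subjects, two keepers each.
single, average = icc_case1([[1, 2], [3, 4], [5, 6]])
print(round(single, 3), round(average, 3), is_reliable(single))
```

With unequal numbers of raters per subject, the denominator needs an adjusted k, which this sketch does not implement.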
In their interactions with the MAB, individual carnivores were coded on their Success (0 = no solutions opened, 1 = success on at least one condition). At least two independent observers verified the classification of trials as successful. Additional data from this task will be reported elsewhere. We regressed success onto sex, age, and subfamily (1 = Pantherinae, 2 = Felinae) in the first step of a hierarchical logistic regression model, using the Wald Chi-square test to determine whether any of the predictors significantly contributed to the outcome. We entered the five behavioral trait facets derived from the PCA in the second step of the model. Independent samples t-tests were conducted to determine whether the subfamilies differed on any of the five facets.
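The two-step model can be written out as follows. The notation is ours rather than the authors': the symbols β, γ, and F are introduced only for illustration, with F_{ij} denoting subject i's composite score on facet j. The Wald statistic is given in its standard single-coefficient form.

```latex
% Step 1: demographic and taxonomic predictors only
\[
\operatorname{logit} P(\mathrm{Success}_i = 1)
  = \beta_0 + \beta_1\,\mathrm{Sex}_i + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{Subfamily}_i
\]

% Step 2: the five behavioral trait facets are added to the Step 1 predictors
\[
\operatorname{logit} P(\mathrm{Success}_i = 1)
  = \beta_0 + \beta_1\,\mathrm{Sex}_i + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{Subfamily}_i
  + \sum_{j=1}^{5} \gamma_j F_{ij}
\]

% Wald chi-square test for a single coefficient (1 degree of freedom)
\[
W = \left( \frac{\hat{\beta}}{\mathrm{SE}(\hat{\beta})} \right)^{2}
  \sim \chi^2_{1} \quad \text{under } H_0\colon \beta = 0
\]
```

Under this formulation, a negative facet coefficient means that higher scores on that facet are associated with lower odds of success.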

Reduction to five facets
The parallel analysis and PCA reduced twenty-six of the behavioral traits to five factors with factor loadings ≥0.40. For these five factors, all PCA values were greater than the eigenvalues derived from the parallel analysis, validating their use (Konečná et al., 2012). Upon examination of the extracted factors, we assigned all traits with values 0.40 or greater (e.g., Gartner & Weiss, 2013) to factors for which they had the highest factor loadings. Thus, we created composite facets representing the following five conceptually coherent facets: Flexible/Friendly, Fearful/Aggressive, Social/Playful, Uninterested, and Cautious (see Table 3). The facets Flexible/Friendly, Fearful/Aggressive, and Social/Playful aligned well with previous research with felids (Gartner & Weiss, 2013). The facet Cautious might also be seen as aligning well with Neurotic and Nervousness, which were extracted by prior studies (Gartner & Weiss, 2013; Phillips et al., 2017).
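The assignment and scoring rules can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: it treats "highest loading" as highest absolute loading, reverse scores a rating x on the 1 to 7 range as 8 − x, and uses invented loadings and ratings with only two facets for brevity. The function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

LOADING_CUTOFF = 0.40            # from the text: only loadings >= 0.40 are considered
SCALE_MIN, SCALE_MAX = 1, 7      # assumed usable range of the rating scale


def assign_facet(loadings: List[float]) -> Optional[Tuple[int, bool]]:
    """Return (facet index, reverse-score flag), or None if no loading reaches the cutoff."""
    best = max(range(len(loadings)), key=lambda j: abs(loadings[j]))
    if abs(loadings[best]) < LOADING_CUTOFF:
        return None
    return best, loadings[best] < 0


def composite_scores(
    ratings: Dict[str, float], loadings: Dict[str, List[float]]
) -> Dict[int, float]:
    """Average a subject's trait ratings within each facet, reverse scoring negative loaders."""
    buckets: Dict[int, List[float]] = {}
    for trait, value in ratings.items():
        assignment = assign_facet(loadings[trait])
        if assignment is None:
            continue  # trait does not load on any facet at the cutoff
        facet, reverse = assignment
        score = SCALE_MIN + SCALE_MAX - value if reverse else value
        buckets.setdefault(facet, []).append(score)
    return {facet: mean(scores) for facet, scores in buckets.items()}


# Invented loadings on two facets and one subject's mean keeper ratings.
demo_loadings = {
    "fearful": [0.10, 0.80],
    "calm": [0.05, -0.70],     # negative loading, so reverse scored
    "playful": [0.60, 0.20],
    "vigilant": [0.30, 0.20],  # below the cutoff, so dropped
}
demo_ratings = {"fearful": 6, "calm": 2, "playful": 5, "vigilant": 4}
print(composite_scores(demo_ratings, demo_loadings))
```

In this invented example, a cat rated high on "fearful" and low on "calm" scores high on the second facet, because the low "calm" rating is flipped before averaging.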

DISCUSSION
Innovation, as a component of behavioral flexibility, is critical for enabling animals to adapt to changing environments. Species and individuals differ in the extent to which they exhibit behavioral flexibility. Identifying behavioral traits that predict flexibility may facilitate captive husbandry strategies and especially conservation efforts. We examined whether captive carnivore behavioral traits predicted innovation, measured as success on a MAB. A twenty-seven-item keeper assessment survey reduced to five behavioral facets: Flexible/Friendly, Fearful/Aggressive, Uninterested, Social/Playful, and Cautious.
Within felids, the most robust behavioral facets from prior research are Sociable, Dominant, and Curious (Gartner & Weiss, 2013). Our PCA with a greater diversity of species identified similar facets (for example, our Social/Playful corresponds to their Sociable). However, our study is important in suggesting that an individual's traits, rather than species differences, seem to predict success in the task. Evidence in other species supports this conclusion. Black rhinoceros individuals rated as more fearful also exhibited a longer latency to contact a novel object (Carlstead, Mellen & Kleiman, 1999) and rainbow trout (Oncorhynchus mykiss) rated as shy but given emboldening experiences became bolder and exhibited reduced latencies to contact a novel object (Frost et al., 2007). In the current study, individuals that were rated as more fearful or aggressive and individuals that were rated as more cautious were less likely to have success on the MAB. However, it is important to note that we tested success on only a single problem-solving task, so future research is needed to test the generalizability of these associations. The behavioral traits assessed here were rated with reasonable reliability across keepers and predicted performance in the MAB, which measured innovation. Specifically, Fearful/Aggressive and Cautious predicted lack of success on the MAB. Highly fearful and cautious animals should be less likely to attempt solutions to novel problems, so these results were expected. Although most of the keepers completed the assessments with no knowledge of the subjects' performance, some keepers for ten subjects may have been influenced in their ratings by observing a subset of the experimental trials.
Because we did not include the trait "intelligence" in our analyses, which was the trait that should be most influenced by the subject's performance, and because no keeper observed all trials, and each keeper had extensive experience of the subjects beyond observing them participate in these trials, we are not concerned that this biased our results. Ideally, studies should have all raters complete their ratings before any of the tests are conducted. This was not always possible here based on the complications of arranging testing at multiple locations during a global pandemic.
Previous studies have identified behavioral traits and facets in felids, some in conjunction with a novel object test (Carlstead, Mellen & Kleiman, 1999; Gartner & Powell, 2012; Powell & Svoke, 2008; Razal, Pisacane & Miller, 2016). This is the first study to report an association of behavioral traits with success on a test of innovation. Diverse behaviors have been associated with problem-solving success in carnivores (Benson-Amram, Weldele & Holekamp, 2013; Benson-Amram et al., 2016; Daniels et al., 2019; Johnson-Ulrich, Johnson-Ulrich & Holekamp, 2018; O'Connor et al., 2022). Based on these findings, we expected Flexible/Friendly and Social/Playful to be predictive of success; however, this was not the case. We defined Flexible as "adapts comfortably to change," Friendly as "initiates proximity," and Curious as "readily explores new situations." Keeper ratings should be informed by knowledge of the animal's behavior in multiple contexts (e.g., shifting indoor and outdoor enclosures, proximity to zoo guests, etc.). Similarly, we defined Sociable as "seeks out companionship" and Playful as "initiates and easily joins in play." However, these terms may be related to age as younger cats were rated as more Social/Playful (r = −0.45, p < 0.001), or to socially housed species, of which some of these subjects are not. It is possible that performance in a single problem-solving task is not a representative measure of the subject's more general abilities. Because neophobia may limit interaction with the novel puzzle box (e.g., coyotes; Young, Touzot & Brummer, 2019), the facets of fearful and cautious may have overshadowed other behavioral traits in predicting success. These results support previous findings suggesting that motivation and object exploration may be better predictors of success compared to cognitive ability or inhibitory skills (Johnson-Ulrich, Johnson-Ulrich & Holekamp, 2018).
This point is important given how frequently cognitive research is conducted to compare cognitive abilities between species (Vonk et al., 2021).
There are other nonsignificant results to note. Our research does not corroborate previous findings that age (e.g., Benson-Amram & Holekamp, 2012) or sex (e.g., Amici et al., 2019) predicted problem-solving. It is important to note that our measure of success was only a very cursory measure of performance in this task. The MAB allows for examination of multiple measures of cognition (e.g., trials to success, number of successful trials, number of solutions learned, latency to learn new solution) and behavior (e.g., number of behaviors performed, perseveration), but we examined only the simplest outcome here as a pilot test of how well behavioral traits could predict problem-solving success, which might be associated with adaptability and flexibility to change in novel environments. Thus, we would encourage future researchers to examine how individual differences predict variable success in tasks that might assess traits relevant for species' survival in the wild or ability to adapt in captivity.

CONCLUSIONS
Many studies of animal cognition have a low sample size (Shaw & Schmelz, 2017). To our knowledge, this study includes data from the largest and most inclusive sample of felids compared to previous studies of felid personality and cognition. Across the thirteen Felidae species we assessed, a coherent behavioral trait structure was extracted involving five facets: Flexible/Friendly, Fearful/Aggressive, Uninterested, Social/Playful, and Cautious. These facets are echoed in the Felidae literature as reviewed by Gartner & Weiss (2013). We report the first demonstration that these traits predicted problem-solving success on a test of innovation. Two facets, Fearful/Aggressive and Cautious, significantly negatively predicted success in this task. This work should be considered preliminary, but we hope the promising results encourage future studies with larger sample sizes and further refinement of the behavioral traits measure. Felid behavioral trait research, in combination with cognitive testing, has practical applications for both captive welfare and wildlife conservation success.